Abstract

Neuron models of associative memory provide a new and promising technology for reliable data
storage and pattern recognition. However, even when the patterns are uncorrelated, the efficiency of
most known models of associative memory is low. We have developed a new version of associative memory
with record storage capacity and noise immunity, which, in addition, is effective
when recognizing correlated patterns.

1 Introduction 

Conventional neural networks cannot efficiently recognize highly correlated patterns. Moreover, they
have a small memory capacity. In particular, the Hopfield neural network [1] can store only
$p_0 \approx N/(2\ln N)$ random $N$-dimensional binary patterns. When the patterns are correlated, this
number falls abruptly. The few algorithms that can be used in this case (e.g., the projection matrix
method [2]) are rather cumbersome and do not allow a simple training principle to be introduced [3].
We offer a simple and effective algorithm that recognizes a great number of highly correlated binary
patterns even under heavy noise. We use a parametrical neural network, which is a fully connected
vector neural network similar to the Potts-glass network [7] or the optical network [8], [9]. The
essence of the approach is given below.

Our goal is to recognize $p$ distorted $N$-dimensional binary patterns $\{Y^{\mu}\}$, $\mu = 1, \dots, p$.
The associative memory that meets this aim works in the following way. Each pattern $Y^{\mu}$ from the
space $\mathbb{R}^N$ is put in one-to-one correspondence with an image $X^{\mu}$ from a space $\Re$ of
greater dimensionality (Sect. 3). Then the set of images $\{X^{\mu}\}$ is used to build a parametrical
neural network (Sect. 2). The recognition algorithm is as follows. An input binary vector
$Y \in \mathbb{R}^N$ is replaced by the corresponding image $X \in \Re$, which is recognized by the
parametrical neural network. The result of recognition is then mapped back into the original
$N$-dimensional space. Thus, recognition of many correlated binary patterns is reduced to recognition
of their images in the space $\Re$. Since the parametrical neural network has an extremely large
memory capacity (Sect. 2) and the mapping $Y \to X$ eliminates the correlations between patterns
almost completely (Sect. 3), the problem is simplified significantly.

2 Parametrical neural network and its recognition efficiency

Let us describe the parametrical neural network (PNN). We consider a fully connected neural network
of $n$ vector-neurons (spins). Each neuron is described by a unit vector
$\mathbf{x}_i = x_i \mathbf{e}_{l_i}$, where the amplitude $x_i$ is equal to $\pm 1$ and
$\mathbf{e}_{l_i}$ is the $l_i$th unit vector-column of the $Q$-dimensional space: $i = 1, \dots, n$;
$1 \le l_i \le Q$. The state of the whole network is determined by the set of vector-columns
$X = (\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n)$. We define the Hamiltonian of the network in
the same way as in the Hopfield model:

$$H = -\frac{1}{2} \sum_{i,j=1}^{n} \mathbf{x}_i^{+} \mathbf{T}_{ij} \mathbf{x}_j, \tag{1}$$

where $\mathbf{x}_i^{+}$ is the $Q$-dimensional vector-row. Unlike the conventional Hopfield model,
where the interconnections are scalars, in Eq. (1) an interconnection $\mathbf{T}_{ij}$ is a
$(Q \times Q)$-matrix constructed according to the generalized Hebb rule [7],

$$\mathbf{T}_{ij} = (1 - \delta_{ij}) \sum_{\mu=1}^{p} \mathbf{x}_i^{\mu} \mathbf{x}_j^{\mu +}, \tag{2}$$

from $p$ initial patterns:

$$X^{\mu} = (\mathbf{x}_1^{\mu}, \mathbf{x}_2^{\mu}, \dots, \mathbf{x}_n^{\mu}), \quad \mu = 1, 2, \dots, p. \tag{3}$$

It is convenient to interpret the network with Hamiltonian (1) as a system of interacting
$Q$-dimensional spins and to use the relevant terminology. In view of (1)-(3), the input signal
arriving at the $i$th neuron (i.e., the local field that acts on the $i$th spin) can be written as

$$\mathbf{h}_i = \sum_{l=1}^{Q} A_l^{(i)} \mathbf{e}_l, \tag{4}$$

where

$$A_l^{(i)} = \sum_{j (\ne i)}^{n} \sum_{\mu=1}^{p} (\mathbf{e}_l \mathbf{x}_i^{\mu}) (\mathbf{x}_j^{\mu} \mathbf{x}_j), \quad l = 1, \dots, Q. \tag{5}$$

The behavior of the system is defined in the natural way: under the action of the local field
$\mathbf{h}_i$ the $i$th spin takes the orientation that is as close to the local field direction as
possible. In other words, the state of the $i$th neuron at the next time step, $t + 1$, is determined
by the rule

$$\mathbf{x}_i(t + 1) = x_{\max} \mathbf{e}_{\max}, \quad x_{\max} = \mathrm{sgn}\left(A_{\max}^{(i)}\right), \tag{6}$$

where max denotes the index of the amplitude $A_l^{(i)}$ that is greatest in modulus in (5).

The evolution of the network is a sequence of changes of neuron states according to (6), with the
energy (1) decreasing. Sooner or later the network finds itself at a fixed point.
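
The dynamics described above can be illustrated with a minimal sketch. It assumes that each vector-neuron is stored as a pair (amplitude, basis index) rather than as a full $Q$-dimensional vector; this encoding, the function names, the tie-breaking choice and the toy patterns are all illustrative assumptions, not part of the original model description.

```python
from typing import List, Tuple

Neuron = Tuple[int, int]   # (amplitude +1/-1, basis index 0..Q-1); hypothetical encoding
Pattern = List[Neuron]


def local_amplitudes(patterns: List[Pattern], state: Pattern, i: int, Q: int) -> List[int]:
    """Amplitudes A_l of the local field acting on neuron i, cf. Eqs. (4)-(5)."""
    amplitudes = [0] * Q
    for pattern in patterns:
        # Sum over j != i of the scalar products (x_j^mu . x_j): a term is
        # nonzero only when both vectors point along the same basis vector.
        overlap = sum(
            pattern[j][0] * state[j][0]
            for j in range(len(state))
            if j != i and pattern[j][1] == state[j][1]
        )
        sign_i, index_i = pattern[i]
        # (e_l . x_i^mu) is nonzero only for l equal to the basis index of x_i^mu.
        amplitudes[index_i] += sign_i * overlap
    return amplitudes


def update_neuron(patterns, state, i, Q):
    """Rule (6): align neuron i with the amplitude of largest modulus."""
    amplitudes = local_amplitudes(patterns, state, i, Q)
    l_max = max(range(Q), key=lambda l: abs(amplitudes[l]))
    if amplitudes[l_max] == 0:
        return state[i]  # assumption: a zero local field leaves the neuron unchanged
    return (1 if amplitudes[l_max] > 0 else -1, l_max)


def recall(patterns, state, Q, max_sweeps=20):
    """Asynchronous sweeps until no neuron changes (a fixed point) or max_sweeps."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            new_value = update_neuron(patterns, state, i, Q)
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state


# Toy data invented for illustration: n = 6 neurons, Q = 4, p = 2 patterns.
P0 = [(1, 0), (-1, 1), (1, 2), (-1, 3), (1, 0), (1, 1)]
P1 = [(-1, 2), (1, 3), (-1, 0), (1, 1), (-1, 3), (1, 2)]
noisy = [(-1, 0), (-1, 2)] + P0[2:]   # sign error in neuron 0, index error in neuron 1
restored = recall([P0, P1], noisy, Q=4)
```

Because only the basis index and the sign of each neuron are stored, the $(Q \times Q)$-matrices of the Hebb rule never have to be formed explicitly in this sketch.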

Let us see how efficiently PNN recognizes noisy patterns. Let the distorted $m$th pattern
$\tilde{X}^m$ come to the system input, i.e. the neurons are in the initial states described as
$\mathbf{x}_i = \hat{a}_i \hat{b}_i \mathbf{x}_i^m$. Here $\hat{a}_i$ is the multiplicative noise
operator, which changes the sign of the amplitude $x_i^m$ of the vector-neuron
$\mathbf{x}_i^m = x_i^m \mathbf{e}_{l_i^m}$ with probability $a$ and keeps it the same with probability
$1 - a$; the operator $\hat{b}_i$ replaces the basis vector $\mathbf{e}_{l_i^m} \in \{\mathbf{e}_l\}_1^Q$
by any other vector from $\{\mathbf{e}_l\}_1^Q$ with probability $b$ and leaves it unchanged with
probability $1 - b$. In other words, $a$ is the probability of an error in the sign of the neuron
("-" in place of "+" and vice versa), and $b$ is the probability of an error in the vector state of
the neuron. The network recognizes the reference pattern $X^m$ correctly if the output of the $i$th
neuron defined by Eq. (6) is equal to $\mathbf{x}_i^m$, that is $\mathbf{x}_i = \mathbf{x}_i^m$.
Otherwise, PNN fails to recognize the pattern $X^m$. According to the Chebyshev-Chernov method [10],
the probability of recognition failure is

$$P_{err} < n \exp\left[-\frac{nQ^2}{2p}(1 - 2a)^2(1 - b)^2\right]. \tag{7}$$

The inequality sets the upper limit for the probability of recognition failure for PNN. The memory
capacity of PNN (i.e., the greatest number of patterns that can be recognized) is found from (7):

$$p_c = \frac{nQ^2}{2\ln n}(1 - 2a)^2(1 - b)^2. \tag{8}$$

Comparison of (8) with similar expressions for the Potts-glass neural network [7] and the optical
network [8], [9] shows that the memory capacity of PNN is approximately twice as large as that of
either model. This means that, other conditions being equal, its recognition power is 20-30% higher.

It is seen from (7) that the noise immunity of PNN increases exponentially with growing $Q$. The
memory capacity also grows: it is $Q^2$ times as large as that of the Hopfield network. For example, if
$Q = 32$, the 180-neuron PNN can recognize 360 patterns ($p/n = 2$) with 85% noise-distorted components
(Fig. 1). With smaller distortions, $b = 0.65$, the same network can recognize as many as 1800 patterns
($p/n = 10$), etc. Note that some time ago the restoration of a 30% noisy pattern for $p/n = 0.1$ was
a demonstration of the best recognition ability of the Hopfield model [11].
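
The numbers in this example can be checked with a short calculation. The sketch below assumes the bound (7) in the form $P_{err} < n\exp[-nQ^2(1-2a)^2(1-b)^2/(2p)]$ and the capacity (8) in the form $p_c = nQ^2(1-2a)^2(1-b)^2/(2\ln n)$; the function names are hypothetical, and `noise_threshold` is simply (8) solved for $b$ at $a = 0$.

```python
import math


def failure_bound(n, Q, p, a, b):
    """Right-hand side of the bound (7) on the recognition failure probability."""
    return n * math.exp(-n * Q**2 * (1 - 2 * a)**2 * (1 - b)**2 / (2 * p))


def capacity(n, Q, a, b):
    """Memory capacity (8)."""
    return n * Q**2 * (1 - 2 * a)**2 * (1 - b)**2 / (2 * math.log(n))


def noise_threshold(n, Q, p):
    """Value of b at which capacity(n, Q, 0, b) equals p (Eq. (8) inverted, a = 0)."""
    return 1 - math.sqrt(2 * p * math.log(n) / n) / Q


n, Q = 180, 32
cap_85 = capacity(n, Q, a=0.0, b=0.85)    # to be compared with p = 360
cap_65 = capacity(n, Q, a=0.0, b=0.65)    # to be compared with p = 1800
b_max_360 = noise_threshold(n, Q, p=360)  # largest b tolerated at p = 360
```

Under these assumptions the capacity comes out at roughly 400 patterns for $b = 0.85$ and roughly 2170 for $b = 0.65$, consistent with the 360 and 1800 patterns quoted in the text. Note that at $p = p_c$ the right-hand side of (7) equals one, so (8) is an asymptotic estimate rather than a guarantee for finite $n$.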

Of course, PNN is computationally more complex than the Hopfield model. On the other hand, the
computer code can be made rather economical by using bit extraction only: it is not necessary to keep
a great number of $(Q \times Q)$-matrices (2) in computer memory. Because of the complex structure of
its neurons, PNN works $Q$ times slower than the Hopfield model, but it makes it possible to store
$Q^2$ times as many patterns. In addition, the Potts-glass network operates $Q$ times slower than PNN.

Fig. 2 shows the recognition reliability $P = 1 - P_{err}$ as a function of the noise intensity $b$
when the number of patterns is twice the number of neurons ($p = 2n$) for $Q = 8, 16, 32$. We see that
when the noise intensity is less than a threshold value, which by (8) with $a = 0$ is

$$b_c = 1 - \frac{1}{Q}\sqrt{\frac{2p}{n}\ln n},$$

the network demonstrates reliable recognition of noisy patterns. We would like to use these outstanding
properties of PNN for recognition of correlated binary patterns. The essence of our approach is given
in the next Sections.



3 Mapping algorithm 

Let $Y = (y_1, y_2, \dots, y_N)$ be an $N$-dimensional binary vector, $y_i = \pm 1$. We divide it
mentally into $n$ fragments of $k + 1$ elements each, $N = n(k + 1)$. With each fragment we associate
an integer number $\pm l$ according to the rule: the first element of the fragment defines the sign of
the number, and the other $k$ elements, denoted here by $y_1, \dots, y_k$, determine the absolute
value $l$:

$$l = 1 + \sum_{i=1}^{k} (y_i + 1) \cdot 2^{i-2}, \quad 1 \le l \le 2^k.$$

Now we associate each fragment with a vector $\mathbf{x} = \pm \mathbf{e}_l$, where $\mathbf{e}_l$ is
the $l$th unit vector of the real space $\mathbb{R}^Q$ and $Q = 2^k$. We see that any binary vector
$Y \in \mathbb{R}^N$ corresponds one-to-one to a set of $n$ $Q$-dimensional unit vectors,
$X = (\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n)$, which we call the internal image of the binary
vector $Y$. (In the next Section we use the internal images $X$ for PNN construction.) The number $k$
is called the mapping parameter.

For example, the binary vector $Y = (-1, 1, -1, -1, 1, -1, -1, 1)$ can be split into two fragments of
four elements: $(-1, 1, -1, -1)$ and $(1, -1, -1, 1)$; the mapping parameter is $k = 3$. The first
fragment ("-2" in our notation) corresponds to the vector $-\mathbf{e}_2$ from the space of
dimensionality $Q = 2^3 = 8$, and the second fragment ("+5" in our notation) corresponds to the vector
$+\mathbf{e}_5 \in \mathbb{R}^8$. The relevant mapping can be written as
$Y \to X = (-\mathbf{e}_2, +\mathbf{e}_5)$.
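
The mapping rule and its inverse can be written down in a few lines. This is a minimal sketch in which a vector $\pm\mathbf{e}_l$ is represented by the pair (sign, $l$) with $1 \le l \le 2^k$; the function names and this representation are illustrative assumptions.

```python
def to_internal_image(y, k):
    """Map a +1/-1 vector of length n*(k+1) to a list of (sign, l) pairs."""
    if len(y) % (k + 1) != 0:
        raise ValueError("vector length must be a multiple of k + 1")
    image = []
    for start in range(0, len(y), k + 1):
        fragment = y[start:start + k + 1]
        sign = fragment[0]  # the first element gives the sign
        # l = 1 + sum_{i=1..k} (y_i + 1) * 2**(i - 2): element i contributes
        # 2**(i - 1) when it equals +1 and nothing when it equals -1.
        l = 1 + sum(((bit + 1) // 2) << i for i, bit in enumerate(fragment[1:]))
        image.append((sign, l))
    return image


def from_internal_image(image, k):
    """Inverse mapping: restore the +1/-1 vector from its internal image."""
    y = []
    for sign, l in image:
        y.append(sign)
        y.extend(1 if ((l - 1) >> i) & 1 else -1 for i in range(k))
    return y


Y = [-1, 1, -1, -1, 1, -1, -1, 1]
X = to_internal_image(Y, k=3)   # the pairs (-1, 2) and (+1, 5), i.e. (-e_2, +e_5)
```

The round trip `from_internal_image(to_internal_image(Y, k), k)` returns the original vector, which is the one-to-one property used in the recognition scheme.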

It is important that the mapping is biunique, i.e., the binary vector $Y$ can be restored uniquely from
its internal image $X$. It is even more important that the mapping eliminates correlations between the
original binary vectors. For example, suppose we have two 75% overlapping binary vectors

$$Y_1 = (1, -1, -1, -1, -1, -1, -1, 1),$$
$$Y_2 = (1, -1, -1, 1, 1, -1, -1, 1).$$

Let us divide each vector into four fragments of two elements. In other words, we map these vectors with
the mapping parameter $k = 1$. As a result we obtain two internal images
$X_1 = (+\mathbf{e}_1, -\mathbf{e}_1, -\mathbf{e}_1, -\mathbf{e}_2)$ and
$X_2 = (+\mathbf{e}_1, -\mathbf{e}_2, +\mathbf{e}_1, -\mathbf{e}_2)$ with
$\mathbf{e}_l \in \mathbb{R}^2$. The overlap of these images is 50%. If the mapping parameter $k = 3$
is used, the relevant images $X_1 = (+\mathbf{e}_1, -\mathbf{e}_5)$ and
$X_2 = (+\mathbf{e}_5, +\mathbf{e}_5)$ with $\mathbf{e}_l \in \mathbb{R}^8$ do not overlap at all.
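
The decorrelation in this example can be reproduced numerically. The sketch below measures overlap as the fraction of coinciding components and represents $\pm\mathbf{e}_l$ as a (sign, $l$) pair; both the helper names and this representation are assumptions made for illustration.

```python
def internal_image(y, k):
    """Map a +1/-1 vector to a list of (sign, l) pairs, fragment by fragment."""
    image = []
    for start in range(0, len(y), k + 1):
        fragment = y[start:start + k + 1]
        l = 1 + sum(((bit + 1) // 2) << i for i, bit in enumerate(fragment[1:]))
        image.append((fragment[0], l))
    return image


def overlap(u, v):
    """Fraction of positions at which two equally long sequences coincide."""
    return sum(a == b for a, b in zip(u, v)) / len(u)


Y1 = [1, -1, -1, -1, -1, -1, -1, 1]
Y2 = [1, -1, -1, 1, 1, -1, -1, 1]

binary_overlap = overlap(Y1, Y2)                                   # 6 of 8 components
overlap_k1 = overlap(internal_image(Y1, 1), internal_image(Y2, 1))  # 2 of 4 vector-neurons
overlap_k3 = overlap(internal_image(Y1, 3), internal_image(Y2, 3))  # 0 of 2 vector-neurons
```

The three values are 0.75, 0.5 and 0.0, matching the 75%, 50% and zero overlaps quoted above.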



4 Recognition of the correlated binary patterns 

In this Section we describe the work of our model as a whole, i.e. the mapping of the original binary
patterns into internal images and the recognition of these images with the aid of PNN. For a given
mapping parameter $k$ we apply the procedure from Sect. 3 to a set of binary patterns
$\{Y^{\mu}\} \in \mathbb{R}^N$, $\mu = 1, \dots, p$. As a result we obtain a set of internal images
$\{X^{\mu}\} \in \Re$, allowing us to build a PNN with $n = N/(k + 1)$ $Q$-dimensional vector-neurons,
where $Q = 2^k$. Note that the dimension $Q$ of the vector-neurons increases exponentially with
increasing $k$, and this improves the properties of PNN.

In the further analysis we use so-called biased patterns $Y^{\mu}$ whose components $y_i^{\mu}$ are
either $+1$ or $-1$ with probabilities $(1 + \alpha)/2$ and $(1 - \alpha)/2$ respectively
($-1 \le \alpha \le 1$). The patterns have a mean activation $\overline{y_i^{\mu}} = \alpha$ and a mean
correlation $\overline{y_i^{\mu} y_i^{\mu'}} = \alpha^2$, $\mu \ne \mu'$. Our aim is to recognize the
$m$th noise-distorted binary pattern $\tilde{Y}^m = (s_1 y_1^m, s_2 y_2^m, \dots, s_N y_N^m)$, where
the random quantity $s_i$ changes the sign of the variable $y_i^m$ with probability $s$ and leaves it
unchanged with probability $1 - s$. In other words, $s$ is the distortion level of the $m$th binary
pattern. Mapped into the space $\Re$, this binary vector turns into the $m$th noise-distorted internal
image $\tilde{X}^m$, which has to be recognized by PNN. Expressing the multiplicative noises $a$ and
$b$ as functions of $s$ and substituting the result in (7), we find that the probability of
misrecognition of the internal image $X^m$ is



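$$P_{err} < n\left(\mathrm{ch}(\nu\alpha^3) - \alpha\,\mathrm{sh}(\nu\alpha^3)\right) \cdot \exp\left[-\frac{\nu}{2\beta}\left(1 + \beta^2\alpha^6\right)\right], \tag{9}$$

where

$$\nu = n(1 - 2s)^2(1 - s)^k, \quad \beta = p[A/(1 - s)]^k, \quad A = (1 + \alpha^2)[1 + \alpha^2(1 - 2s)]/4.$$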
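
As a consistency check, the following short derivation shows how (9) reduces to the bound (7) for unbiased patterns. It takes (9) in the form $P_{err} < n(\mathrm{ch}(\nu\alpha^3) - \alpha\,\mathrm{sh}(\nu\alpha^3))\exp[-(\nu/2\beta)(1 + \beta^2\alpha^6)]$ with $\nu = n(1-2s)^2(1-s)^k$, $\beta = p[A/(1-s)]^k$ and $A = (1+\alpha^2)[1+\alpha^2(1-2s)]/4$, and it assumes that each of the $k + 1$ bits of a fragment is distorted independently with probability $s$, so that $a = s$ and $1 - b = (1 - s)^k$. Both the form of (9) and these identifications are assumptions of this sketch.

```latex
\begin{align*}
\alpha = 0:\quad & A = \tfrac{1}{4}, \qquad
  \mathrm{ch}(0) - 0 \cdot \mathrm{sh}(0) = 1, \\
& \frac{\nu}{2\beta} = \frac{n(1-2s)^2(1-s)^k}{2p}\,[4(1-s)]^k
  = \frac{n\,4^k}{2p}\,(1-2s)^2(1-s)^{2k}, \\
& P_{err} < n\exp\!\left[-\frac{nQ^2}{2p}\,(1-2s)^2(1-s)^{2k}\right],
  \qquad Q = 2^k .
\end{align*}
```

The last line coincides with (7) once $a = s$ and $1 - b = (1 - s)^k$ are substituted, and for $k = 0$ it gives the familiar Hopfield-type bound $n\exp[-n(1-2s)^2/(2p)]$.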



Expression (9) describes the Hopfield model when $k = 0$. In this case, even without a bias
($\alpha = 0$) the memory capacity does not exceed a small value $p_0 \approx N/(2\ln N)$. Moreover,
even if small correlations ($\alpha > N^{-1/3}$) are present, the number of recognizable patterns is
reduced $N\alpha^3$ times. In other words, the network almost fails to work as an associative memory.
The situation changes noticeably when the parameter $k$ grows: the memory capacity increases and the
influence of correlations decreases. In particular, when correlations are small ($\nu\alpha^3 < 1$),
we estimate the memory capacity from (9) as

$$p = p_0 \frac{[(1 - s)^2/A]^k}{k + 1}.$$



Let us return to the example from [11] with $\alpha = 0$. When $k = 5$, in the framework of our approach
one can use 5 times as many random binary patterns, and when $k = 10$, the number of
patterns is 80 times greater.
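
These gain factors can be reproduced with a short calculation. The sketch assumes that the small-correlation capacity has the form $p = p_0 [(1-s)^2/A]^k/(k+1)$ with $A = (1+\alpha^2)[1+\alpha^2(1-2s)]/4$, and that the example refers to a distortion level $s = 0.3$ (the 30% noise of the Hopfield example); the function name is hypothetical.

```python
def capacity_gain(k, s, alpha=0.0):
    """Ratio p / p_0 for mapping parameter k, distortion s and bias alpha."""
    A = (1 + alpha**2) * (1 + alpha**2 * (1 - 2 * s)) / 4
    return ((1 - s)**2 / A)**k / (k + 1)


gain_k5 = capacity_gain(5, s=0.3)     # roughly 5
gain_k10 = capacity_gain(10, s=0.3)   # roughly 80
```

For $\alpha = 0$ the ratio is $[4(1-s)^2]^k/(k+1)$, so the gain grows exponentially with $k$ as long as $s < 1/2$.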

When the degree of correlation is high ($\nu\alpha^3 > 1$), the memory capacity is somewhat smaller:

$$p \sim \alpha^{-3}[(1 - s)/A]^k.$$



Nevertheless, in either case increasing $k$ gives an exponential growth of the number of recognizable
patterns and a rise in recognition reliability.

Fig. 3 shows the growth of the number of recognizable patterns as $k$ increases for distortions $s$ of
10% to 50% ($\alpha = 0$).

Fig. 4 demonstrates how the destructive effect of correlations diminishes as $k$ increases: the
recognition power of the network is zero when $k$ is small; however, when the mapping parameter $k$
reaches a certain critical value, the misrecognition probability falls sharply (the curves are drawn
for $\alpha = 0, 0.2, 0.5, 0.6$, $p/N = 2$ and $s = 30\%$). Our computer simulations confirm these
results.



5 Concluding remarks 

Our algorithm, using the large memory capacity of PNN and its high noise immunity, allows us to
recognize many correlated patterns reliably. The method also works when all patterns have the same
fragment or fragments. For instance, for $p/N = 1$ and $k = 10$, this kind of neural network permits
reliable recognition of patterns with 40% coincidence of binary components.

When initially there is a set of correlated vector-neuron patterns which must be stored and recognized,
the following algorithm can be used to suppress correlations. First, the patterns with vector
coordinates of the original dimensionality $q_1$ are transformed into binary patterns. Then they are
mapped into patterns with vector neurons of dimensionality $q_2$ ($q_2 > q_1$). A PNN is built using
the patterns with $q_2$-dimensional vector neurons. This double mapping eliminates correlations and
provides reliable pattern recognition.

In conclusion we would like to point out that reliable recognition of correlated patterns requires a
random numbering of their coordinates (naturally, in the same fashion for all the patterns). Then the
constituent vectors do not have large identical fragments. In this case decorrelation of the patterns
can be done with a relatively small mapping parameter $k$ and, therefore, takes less computation time.
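
The random renumbering of coordinates can be sketched as follows. One permutation is drawn once and applied to every pattern, and it has to be undone after recognition; the function names, the seed handling and the two example patterns are illustrative assumptions.

```python
import random


def shuffle_coordinates(patterns, seed=0):
    """Apply one and the same random permutation of coordinates to every pattern."""
    perm = list(range(len(patterns[0])))
    random.Random(seed).shuffle(perm)
    return [[pattern[j] for j in perm] for pattern in patterns], perm


def restore_coordinates(vector, perm):
    """Undo the permutation for a single (e.g. recognized) vector."""
    restored = [0] * len(perm)
    for new_pos, old_pos in enumerate(perm):
        restored[old_pos] = vector[new_pos]
    return restored


# Two invented patterns sharing a long identical fragment (the first six coordinates).
Y1 = [1, 1, 1, 1, 1, 1, -1, 1, -1, 1, -1, 1]
Y2 = [1, 1, 1, 1, 1, 1, 1, -1, 1, -1, 1, -1]
shuffled, perm = shuffle_coordinates([Y1, Y2], seed=1)
```

The permutation does not change how many components two patterns share; it only spreads the shared components over different fragments, which is what allows a smaller mapping parameter $k$.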

The work was supported by the Russian Basic Research Foundation (grant 01-01-00090), the program
"Intellectual Computer Systems" (project 2.45) and by the President of the Russian Federation
(grant SS-1152-2003-1).

List of captions:

Fig. 1. Process of recognition of the letter "A" by PNN at $p/n = 2$, $Q = 32$, $b = 0.85$.
Noise-distorted pixels are marked in gray.

Fig. 2. Recognition reliability of PNN, $P = 1 - P_{err}$, as a function of the noise intensity $b$.
Fig. 3. Growth of memory capacity with the mapping parameter $k$ ($s = 0.1 - 0.5$).
Fig. 4. Fall of the misrecognition probability with the mapping parameter $k$.